A Model for Integrating the Publication and Preservation of Journal Articles
Abstract
There are policy, technical, and workflow gaps in library efforts to preserve online journal literature. Since libraries are increasingly involved in journal publishing, HathiTrust, a shared preservation-quality digital repository, is a natural place to archive and provide access to journal literature to ensure its long-term preservation and discoverability. The U-M Library is funding the creation of mPach, an open-source, end-to-end publishing system in which archiving in HathiTrust happens as a byproduct of publication rather than being carried out after the fact. The architecture of mPach, its envisioned workflow, and plans for creating a shared infrastructure for publishing open-access journals are all summarized.

1 The deficit in journal preservation

Until quite recently, publishers produced documents on physical media, and libraries acquired and preserved copies of these documents. But in the era of the Internet, when publishers host content online, the library’s role in acquiring and preserving the content is in jeopardy: without special licensing arrangements such as those often provided by open-access journals, a library has no legal right to make a copy of the content for preservation.

Various business models have evolved to address this situation, especially for journals, which are increasingly available only online. For non-open-access journals, research libraries often negotiate the right to create a digital copy of any content acquired during the period of subscription [1] and make this content available only to their patrons [2], though few are equipped to provide this kind of restricted access and archiving with integrated browse and search functions.
To address the more pressing concern of publishers going out of business without any libraries holding a copy of the content, libraries and publishers have collaborated in initiatives like LOCKSS [3], CLOCKSS [4], and Portico [5] to guarantee that at least one copy of the content will become available if it is no longer available from the publisher. Similarly, the Koninklijke Bibliotheek (KB) and Elsevier reached an agreement in 2002 whereby the KB will preserve Elsevier journals under terms similar to those governing journals that use LOCKSS, CLOCKSS, and Portico [6].

Still, there are problems with these models. LOCKSS and CLOCKSS use web crawling, which captures the appearance of webpages but not their underlying structure or search functionality. Portico and the KB, on the other hand, rely on publishers to deliver journal articles in valid file formats, and to deliver not just the version first published but also any corrected versions of these articles.

One way to ensure that a library always has access to the latest content is for the library to operate the very system used to publish the journal. A 2010 survey of a cross-section of North American academic libraries found that, of 144 responding institutions, 43 offered “operational publishing services” to scholars at their institution [7]. Of these 43 institutions, most host publications using open-source software such as Open Journal Systems (OJS) [8] or DSpace [9], while about a quarter use Digital Commons [10], a hosted platform provided by bepress. Unfortunately, all of these platforms deliver to users only those files (primarily PDF files) created and uploaded by a journal editor. Since the library is not in a position to control the software and workflows used to create these files, the library can provide only bitwise preservation of the files, which severely hampers future migration of the content.
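To make the limits of bitwise preservation concrete, the following is a minimal sketch of a fixity check, assuming SHA-256 digests recorded at deposit time. The function names and the sample bytes are hypothetical and are not taken from OJS, DSpace, Digital Commons, or HathiTrust.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def fixity_ok(data: bytes, recorded_digest: str) -> bool:
    """True if the bytes still match the digest recorded at deposit time."""
    return sha256_digest(data) == recorded_digest


# Hypothetical stand-in for an editor-supplied PDF file.
deposited = b"%PDF-1.4 example article bytes"
recorded = sha256_digest(deposited)

# An unchanged copy passes the check; a copy with any altered byte fails.
unchanged_passes = fixity_ok(deposited, recorded)
altered_passes = fixity_ok(deposited + b"x", recorded)
```

A check of this kind can confirm that the stored bytes are intact, but it says nothing about whether the file's structure is understood well enough to migrate the content to a new format, which is the gap described above.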
2 A higher standard for preservation

Since libraries are increasingly involved in journal publishing, HathiTrust [11], a shared preservation-quality digital repository, is a natural place to archive and provide access to journal literature to ensure its long-term preservation and discoverability. HathiTrust already archives and provides access to reformatted library holdings, but the University of Michigan Library, a founding member of HathiTrust, sees an opportunity to use HathiTrust for publishing born-digital journals as well.

To develop an infrastructure in support of low-cost university-based publishing that addresses the needs and values of both content creators and librarians, the U-M Library is funding the creation of mPach [12], an open-source, end-to-end publishing system in which the act of publishing and the act of archiving are unified. In other words, archiving in HathiTrust happens as a byproduct of publication rather than being carried out after the fact. mPach leverages existing components of HathiTrust and available open-source software where appropriate.

Figure 1: Major parts of mPach

Archiving is not as simple as saving a copy of a file produced by a journal editor, as OJS and institutional repositories generally do. Instead, the content needs to be stored in a format that allows digital preservation. PDF/A, a non-proprietary variant of the PDF family standardized as ISO 19005, is often suggested for such needs, but even a PDF/A file is poorly suited for use with screen readers for the visually impaired and for any non-paginated display, and is suboptimal even for searching and data mining.

Proceedings of the 15th All-Russian Conference "Digital Libraries: Advanced Methods and Technologies, Digital Collections" (RCDL-2013), Yaroslavl, Russia, October 14-18, 2013.
Rather than preserving the paginated appearance of a document, the text of the article needs to be stored in a format that reflects its structure and semantics, with associated media in formats that can be preserved and rendered. mPach has developed a specification for journal articles that uses the Journal Article Tag Suite (JATS), an application of NISO Z39.96-2012 [13], for the text and stores this with high-quality versions of media objects and with a METS record containing structural and preservation metadata.

3 An overview of mPach

There are three major parts of mPach (see also figure 1), each of which includes components in various stages of development at the time of writing:

• the peer review and editorial system: what authors and reviewers interact with
• Prepper: what prepares the article for ingest into HathiTrust for archiving and publication
• modified HathiTrust components: various modifications to existing components of the HathiTrust environment to support born-digital journal articles

As a modular system, mPach could be used with any peer review and editorial system that is capable of interacting with Prepper; however, the developers have chosen to provide OJS as the default option. Despite having no support for digital preservation, OJS is already widely used for library-based journal publishing, and mPach’s integration with this software will allow for a smooth transition of journals already published using OJS into the HathiTrust repository.

Integration with mPach requires that manuscripts that reach the “layout” stage in OJS be sent to Prepper, which prepares the HathiTrust Submission Information Package (SIP). Prepper provides a user interface for the editor of a journal: a dashboard for administering the journal and putting manuscripts through a production process, akin to composition and typesetting, that prepares all content according to the preservation standard developed for mPach content in HathiTrust.
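The advantage of storing article text as JATS rather than as a paginated file can be illustrated with a short sketch. It assumes a simplified, hand-written fragment that uses real JATS element names (`article-title`, `sec`, `fig`, `alt-text`) but has not been validated against the JATS DTD or against the mPach specification; the `outline` function is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified JATS-like fragment written for illustration only.
JATS_SAMPLE = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph.</p>
      <fig id="f1">
        <caption><p>Example figure.</p></caption>
        <alt-text>A bar chart with three bars.</alt-text>
        <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="f1.tif"/>
      </fig>
    </sec>
    <sec>
      <title>Methods</title>
      <p>Second paragraph.</p>
    </sec>
  </body>
</article>"""


def outline(jats_xml: str) -> dict:
    """Pull the title, top-level section headings, and figure alt text from JATS XML."""
    root = ET.fromstring(jats_xml)
    return {
        "title": root.findtext("front/article-meta/title-group/article-title"),
        "sections": [sec.findtext("title") for sec in root.iterfind("body/sec")],
        "alt_texts": [alt.text for alt in root.iter("alt-text")],
    }


summary = outline(JATS_SAMPLE)
```

Because headings and alt text are explicit elements, a screen reader, a search indexer, or a future format migration can address them directly instead of inferring them from page layout.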
Prepper invokes Norm, a Python application developed to convert manuscripts from Office Open XML (“DOCX”) format [14] into XML that conforms to JATS. DOCX is the default option because, like OJS, it is widely used in the editorial process of journals published by libraries. The Prepper interface also guides the staff member through reviewing validation errors detected during Norm’s conversion, uploading high-resolution figures, supplying “alt text” for figures, previewing the article as rendered using the default stylesheet (based on the Preview XSLT stylesheets [15]), uploading supplementary material [16], and submitting the article for ingest into HathiTrust.

mPach requires a number of significant modifications to HathiTrust components and workflows originally designed to support reformatted print materials. The reading interface in HathiTrust, which previously supported only rendering of digitized page images, now renders JATS XML as HTML, lets a user download a dynamically generated PDF or EPUB, displays metadata specific to articles (figure 2), and links to a special “collection” for the journal in HathiTrust’s Collections application [17] that allows for browsing volumes and issues of the journal (figure 3).

Figure 2: Mockup of an article viewed in HathiTrust’s user interface

Discovery of known items in HathiTrust using metadata like title and author is currently provided by a catalog of MARC records, one per item in the repository. For mPach, each article has its own analytic catalog record, tied to a monographic record for the journal as a whole. Finally, the HathiTrust Data API [18] allows for the content of each article to be retrieved for use outside of the native HathiTrust interface.

Note that by policy HathiTrust closes access to content only for legal reasons, not because a rightsholder wants to restrict access. Therefore, mPach supports only the publishing of open-access journals.

Figure 3: Mockup of a journal viewed in HathiTrust’s user interface
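The kind of DOCX-to-JATS conversion that Norm performs can be sketched roughly as follows. This is not Norm's code, and its actual rules are not described here. The sketch assumes that a DOCX file is a ZIP archive whose main text lives in `word/document.xml`, that paragraphs styled `Heading1` open a new section, and that all other non-empty paragraphs become `<p>` elements. The function names and the in-memory sample document are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def paragraph_text(p: ET.Element) -> str:
    """Join the text of all runs in one Word paragraph."""
    return "".join(t.text or "" for t in p.iterfind(".//w:t", NS))


def paragraph_style(p: ET.Element):
    """Return the paragraph style id (e.g. 'Heading1'), or None if unset."""
    style = p.find("w:pPr/w:pStyle", NS)
    return style.get(f"{{{W}}}val") if style is not None else None


def docx_to_jats_body(docx_bytes: bytes) -> ET.Element:
    """Map Word paragraphs to a JATS-like <body> with <sec>, <title>, <p>."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        doc = ET.fromstring(zf.read("word/document.xml"))
    body = ET.Element("body")
    current = body  # paragraphs before the first heading go directly in <body>
    for p in doc.iterfind("w:body/w:p", NS):
        text = paragraph_text(p)
        if not text:
            continue  # skip empty paragraphs
        if paragraph_style(p) == "Heading1":
            current = ET.SubElement(body, "sec")
            ET.SubElement(current, "title").text = text
        else:
            ET.SubElement(current, "p").text = text
    return body


def make_sample_docx() -> bytes:
    """Build a tiny in-memory archive holding only word/document.xml (illustration)."""
    document = (
        f'<w:document xmlns:w="{W}"><w:body>'
        '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr>'
        '<w:r><w:t>Introduction</w:t></w:r></w:p>'
        '<w:p><w:r><w:t>Libraries publish </w:t></w:r>'
        '<w:r><w:t>journals.</w:t></w:r></w:p>'
        '</w:body></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", document)
    return buf.getvalue()


jats_body = ET.tostring(docx_to_jats_body(make_sample_docx()), encoding="unicode")
```

A production converter would also need to handle nested headings, inline formatting, tables, figures, footnotes, and references, and would validate its output against the JATS schema, which is where the validation errors reviewed in Prepper would arise.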
Publication date: 2013